VLSI for High-Speed Digital Signal Processing


Quarterly Progress Report - 4/1/92 through 6/30/92

During the present quarter we made architectural changes to the five-processor
ring-structured programmable digital filter IC to improve system performance.
The main change is a register inserted between the coefficient RAM and the
multiplier, which removes the RAM read time from the critical path of the
multiplier. This yields a substantial improvement, since the critical path of
the multiplier determines the performance of the overall five-processor
system. Figure 1 shows a block diagram of the new ALU architecture.
A TinyChip comprising the multiplier with the redesigned carry-select
vector-merge adder described in the previous report has been sent to MOSIS for
fabrication. The IC consists of the 12-bit by 11-bit multiplier, the
coefficient and data input registers (see Figure 1), the output register, and
RAM to store the coefficient and input data. The multiplier itself consists of
3100 transistors occupying an area of 1.53 mm² (1.313 mm by 1.166 mm) in 2-µm
CMOS technology and is simulated to operate in 22 ns. We expect the prototype
parts for testing in the middle of the next quarter. Figure 2 shows the
architecture of the TinyChip submitted for fabrication, and Figure 3 shows the
layout of this IC.

The ALU is pipelined into two sections: the multiplier and the adder/subtracter
(see Figure 1). Since the adder/subtracter will clearly operate faster than the
multiplier, we can skew the clock to the register at the output of the
multiplier relative to the clocks at the input and output of the ALU to allow
the multiplier additional time to perform its computation. This essentially
allows the multiply operation to extend partially into the cycle allocated for
the addition/subtraction, so long as the two operations are complete in two
cycles. Of course, in order to prevent race conditions, the skew can be no
greater than the minimum clock-to-output delay of the ALU input registers. For
the current 2-µm CMOS design this is approximately 2 ns, which allows the cycle
time for the system to be reduced from the 22 ns required by the multiplier to
20 ns. Thus, the overall five-processor system can be expected to operate at
approximately 50 MHz in 2-µm CMOS technology.
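
The cycle-time arithmetic above can be checked with a short sketch. It is a
simplified model, not the design team's timing analysis: it ignores register
setup time and clock-to-output delay, and it assumes the input and output
clocks of the ALU are aligned while only the register at the multiplier output
is clocked late. The 22-ns multiplier delay and the 2-ns skew come from the
report; the adder/subtracter delay is not stated there, so the 15-ns value and
all names below are hypothetical.

```python
# Simplified timing budget for a two-stage pipeline with a skewed middle clock.
# Values from the report: multiplier delay 22 ns, allowable skew about 2 ns.
# The adder/subtracter delay is NOT given in the report; 15 ns is a placeholder.

def min_cycle_time(t_mult_ns, t_add_ns, skew_ns):
    """Smallest cycle time when the multiplier output register is clocked
    skew_ns later than the ALU input and output registers."""
    # The multiplier stage gets (cycle + skew), so it needs cycle >= t_mult - skew.
    # The adder stage gets (cycle - skew), so it needs cycle >= t_add + skew.
    return max(t_mult_ns - skew_ns, t_add_ns + skew_ns)

T_MULT_NS = 22.0          # simulated multiplier delay (from the report)
MAX_SKEW_NS = 2.0         # minimum clock-to-output delay of the input registers
T_ADD_NS_ASSUMED = 15.0   # hypothetical adder/subtracter delay

cycle_ns = min_cycle_time(T_MULT_NS, T_ADD_NS_ASSUMED, MAX_SKEW_NS)
clock_mhz = 1000.0 / cycle_ns
print(f"cycle time: {cycle_ns} ns, clock rate: {clock_mhz} MHz")
```

Under these assumptions the multiplier, not the adder/subtracter, still sets
the cycle time, and the result matches the 20 ns and 50 MHz quoted above. With
zero skew the same function returns the 22-ns multiplier delay.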

We have also redesigned the dual-port register blocks to operate at 50 MHz. A
TinyChip consisting of a 16 by 16-bit dual-port register block is currently
being submitted to MOSIS.

We have just completed the design and layout of the program storage memory for
our processors. Each processor requires a memory of sixteen 32-bit
instructions. Program data can be written to any location in any order, but
can only be read back sequentially. The minimum write time is designed to be
200 ns and the minimum read time is 15.3 ns. We will fabricate and test this
layout during the next quarter.
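
The access pattern of the program storage memory (random-order write,
sequential-only read) can be illustrated with a small behavioral sketch. Only
the sixteen-word depth and 32-bit width come from the report. The class name,
the method names, the reset method, and the wrap-around of the read pointer
after the last instruction are assumptions made for illustration; the report
does not describe how the read sequence restarts.

```python
class ProgramStore:
    """Toy behavioral model: write to any address, read only in sequence."""

    DEPTH = 16   # instructions per processor (from the report)
    WIDTH = 32   # bits per instruction (from the report)

    def __init__(self):
        self._words = [0] * self.DEPTH
        self._next = 0  # read pointer; there is no addressed read port

    def write(self, address, word):
        """Store one instruction word at an arbitrary address."""
        if not 0 <= address < self.DEPTH:
            raise IndexError("address out of range")
        if not 0 <= word < (1 << self.WIDTH):
            raise ValueError("word does not fit in 32 bits")
        self._words[address] = word

    def read_next(self):
        """Return the next instruction in sequence and advance the pointer."""
        word = self._words[self._next]
        # Wrapping back to address 0 is an assumption, not stated in the report.
        self._next = (self._next + 1) % self.DEPTH
        return word

    def reset(self):
        """Restart the read sequence at address 0 (hypothetical control)."""
        self._next = 0


store = ProgramStore()
# Load three instructions out of order, then fetch them in order.
for address, word in [(2, 0xCAFE0002), (0, 0xCAFE0000), (1, 0xCAFE0001)]:
    store.write(address, word)
fetched = [store.read_next() for _ in range(3)]
```

The asymmetry between the 200-ns write time and the 15.3-ns read time is
consistent with a memory that is loaded once and then read every cycle; the
model above captures only the access order, not the timing.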

The following journal publication citing ONR support under this grant has
appeared during the present quarter:

A. Y. Kwentus, M. J. Werter and A. N. Willson, Jr., “A Programmable Digital
Filter IC Employing Multiple Processors on a Single Chip,” IEEE Transactions
on Circuits and Systems for Video Technology, vol. 2, June 1992, pp. 231-244.


In addition, our paper describing the ring-structured multiprocessor chip has
been accepted for the IEEE Workshop on VLSI Signal Processing, to be held in
Napa, California, on October 28-31, 1992. Copies of the paper and the
acceptance notice are enclosed.

Figure 1 - ALU Architecture

Figure 2 - Block Diagram of Multiplier Test IC

Figure 3 - Layout for the Multiplier Test IC


